Since FiveThirtyEight recently shared access to a lot of their data, I thought it would be fun to download and analyze data from one of their blog posts in R. I remember reading this article on scientists’ donations to political campaigns, and since the original blog post does not feature any visualizations, I thought I’d come up with some of my own.
Here are the packages I will be using:
library(RCurl)
library(reshape)
library(dplyr)
library(ggplot2)
library(stringr)
library(rebus)
library(maps)
library(leaflet)
library(tigris)
library(purrr)
library(knitr)
library(kableExtra)
link = "https://media.githubusercontent.com/media/fivethirtyeight/data/master/science-giving/science_federal_giving.csv"
data = read.csv(text = getURL(link),stringsAsFactors = FALSE)
### Clean up missingness in the data
data_cleaned = data %>%
mutate_all(funs(ifelse(. %in% c("","NULL"),NA,.)))
president_funding = data_cleaned %>%
mutate(transaction_amt = as.numeric(transaction_amt)) %>%
filter(cand_pty_affiliation %in% c("DEM","REP","LIB","IND"),
cycle == "2016",
cand_office == "P",
cand_status == "C") %>%
group_by(state,cand_pty_affiliation,cand_name,classification) %>%
summarise(money = sum(transaction_amt,na.rm=TRUE))
### See the top 10 candidates who received funding from scientists by classification
top_10_funded = president_funding %>%
group_by(cand_name) %>%
summarise(total_donations = sum(money)) %>%
ungroup() %>%
top_n(10,wt = total_donations)
president_funding %>%
group_by(cand_name,classification) %>%
summarise(mon = sum(money)/1000000) %>%
right_join(top_10_funded,by="cand_name")%>%
ggplot()+
geom_bar(aes(x = reorder(cand_name,mon),y=mon,fill=classification),stat='identity') +
coord_flip() +
scale_y_continuous(expand=c(0,0)) +
theme(legend.position = c(0.8, 0.15),legend.background = element_rect(fill="white")) +
labs(title="Campaign Donations by Scientists to 2016 Presidential Candidates", x = "Presidential Candidate",y = "Total Donations Received ($, millions)")
The FiveThirtyEight article noted that scientists were more willing to show support to candidates with scientific backgrounds. Just for validation, we can look at a list of the ten Congressional Candidates in 2016 who received the most donations from scientists:
data_cleaned %>%
mutate(transaction_amt = as.numeric(transaction_amt),
district = paste(cand_office_st,cand_office_district,sep="-")) %>%
filter(cand_pty_affiliation %in% c("DEM","REP"),
cycle == "2016",
cand_office == "H",
cand_status == "C")%>%
group_by(district,cand_pty_affiliation,cand_name) %>%
summarise(donations = sum(transaction_amt, na.rm=TRUE)) %>% #,
# pct_out_of_state = sum(cand_office_st != state, na.rm = TRUE)/n() * 100)
ungroup() %>%
arrange(desc(donations)) %>%
head(10) %>%
kable("html") %>%
kable_styling()
| district | cand_pty_affiliation | cand_name | donations |
|---|---|---|---|
| IL-11 | DEM | FOSTER, G. WILLIAM (BILL) | 319025 |
| PA-8 | DEM | NAUGHTON, SHAUGHNESSY | 158806 |
| CA-17 | DEM | KHANNA, ROHIT | 146441 |
| FL-9 | DEM | GRAYSON, DENA MD, PHD | 95300 |
| IL-8 | DEM | KRISHNAMOORTHI, S. RAJA | 77108 |
| FL-23 | DEM | CANOVA, TIMOTHY A. | 70696 |
| NJ-1 | DEM | NORCROSS, DONALD W | 63573 |
| NY-19 | DEM | TEACHOUT, ZEPHYR | 63219 |
| NY-23 | DEM | PLUMB, JOHN | 56567 |
| CA-9 | DEM | MCNERNEY, JERRY | 54424 |
Of the list above, the two with the greatest amount of donations from scientists were Bill Foster, a former physicist, and Shaughnessy Naughton, a former drug researcher, both of whom were referenced in the original article. Also on the list were Dr. Dena Grayson (MD, PhD), Raja Krishnamoorthi (Bachelor’s degree in Mechanical Engineering), John Plumb (MS in Physics), and Jerry McNerney (PhD in Mathematics). While not all of these candidates won their respective races, several continue to dedicate time and resources to encouraging other scientists to run for office.
Next, I
all_funding_data = readRDS("all_funding_data.RDS")
congress_funding = data_cleaned %>%
mutate(transaction_amt = as.numeric(transaction_amt)) %>%
filter(cand_pty_affiliation %in% c("DEM","REP"),
cycle == "2016",
cand_office == "H",
cand_status == "C")%>%
group_by(cand_office_st,cand_pty_affiliation,cand_office_district) %>%
summarise(money = sum(transaction_amt,na.rm=TRUE)) %>%
cast(cand_office_st + cand_office_district ~ cand_pty_affiliation, value="money") %>%
filter(!is.na(cand_office_district)) %>%
mutate_all(funs(ifelse(is.na(.),0,.))) %>%
mutate(party_diff = DEM - REP) %>% #,
#cand_office_district = ifelse(cand_office_district == 0, 1, cand_office_district)) %>%
left_join(state.fips %>%
select(cand_office_st = abb, fips) %>%
filter(!duplicated(fips)),by="cand_office_st") %>%
mutate(fips = ifelse(nchar(fips)== 1, paste0("0",fips),fips),
cand_office_district = ifelse(nchar(cand_office_district)== 1, paste0("0",cand_office_district),cand_office_district),
ID = paste(fips,cand_office_district,sep="_"),
cand_office_district = str_replace(cand_office_district,"00","01"),
ID_name = paste0(cand_office_st,cand_office_district)) %>%
left_join(all_funding_data,by=c("cand_office_st","cand_office_district")) %>%
rowwise() %>%
mutate(dem_pct = ifelse(!D == 0, DEM/D * 100, 0),
rep_pct = ifelse(!R == 0, REP/R * 100, 0),
partisan_diff = sum(dem_pct,-rep_pct,na.rm=TRUE))
While it may be interesting to look at the candidates who received the most donations from scientists, I was also really curious to determine where donations from scientists were most ``impactful." By combining data from FiveThirtyEight with information on the total amount of money raised by each Congressional Candidate in the 2016 cycle, I wanted to see what percentage of the donations by political party and congressional district came from scientists.
| district | percent | winner |
|---|---|---|
| FL-04 | 17.54185 | John Rutherford (R) |
| LA-03 | 16.60414 | Clay Higgins (R) |
| TN-03 | 14.14330 | Chuck Fleischmann (R) |
| FL-09 | 13.79385 | Darren Soto (D) |
| IL-11 | 11.05217 | Bill Foster (D) |
| district | percent | winner |
|---|---|---|
| CA-11 | 67.10675 | Mark Desaulnier (D) |
| WA-01 | 25.67068 | Suzan DelBene (D) |
| TX-15 | 16.93235 | Vicente Gonzalez (D) |
| CA-18 | 14.04920 | Anna Eshoo (D) |
| FL-10 | 11.76984 | Val Demings (D) |